Nonlinear Dynamics of Nonsynonymous (dN) and Synonymous (dS) Substitution Rates Affects Inference of Selection
Abstract
Selection modulates gene sequence evolution in different ways by constraining potential changes of amino acid sequences (purifying selection) or by favoring new and adaptive genetic variants (positive selection). The number of nonsynonymous differences in a pair of protein-coding sequences can be used to quantify the mode and strength of selection. To control for regional variation in substitution rates, the proportionate number of nonsynonymous differences (d(N)) is divided by the proportionate number of synonymous differences (d(S)). The resulting ratio (d(N)/d(S)) is a widely used indicator for functional divergence to identify particular genes that underwent positive selection. With the ever-growing amount of genome data, summary statistics like mean d(N)/d(S) allow gathering information on the mode of evolution for entire species. Both applications hinge on the assumption that d(S) and mean d(S) (approximately branch length) are neutral and adequately control for variation in substitution rates across genes and across organisms, respectively. We here explore the validity of this assumption using empirical data based on whole-genome protein sequence alignments between human and 15 other vertebrate species and several simulation approaches. We find that d(N)/d(S) does not appropriately reflect the action of selection as it is strongly influenced by its denominator (d(S)). Particularly for closely related taxa, such as human and chimpanzee, d(N)/d(S) can be misleading and is not an unadulterated indicator of selection. Instead, we suggest that inconsistencies in the behavior of d(N)/d(S) are to be expected and highlight the idea that this behavior may be inherent to taking the ratio of two randomly distributed variables that are nonlinearly correlated. New null hypotheses will be needed to adequately handle these nonlinear dynamics.
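The closing point of the abstract, that the behavior of d(N)/d(S) may be inherent to taking the ratio of two random variables, can be illustrated with a minimal simulation sketch. This is not the authors' simulation procedure. The function name `simulate_ratio`, the site counts, the binomial sampling model, the omission of any multiple-hit correction, and the exclusion of genes with zero synonymous differences are all assumptions made for illustration. The true ratio (`omega`) is held constant, so any change in the estimated ratio across divergence levels comes from sampling noise in the denominator rather than from selection.

```python
import numpy as np


def simulate_ratio(t, omega=0.2, n_syn=150, n_nonsyn=450, n_genes=20000, seed=0):
    """Simulate raw dN/dS estimates for many genes at one divergence level.

    Hypothetical toy model (an assumption, not the paper's method):
    each synonymous site differs with probability t, each nonsynonymous
    site with probability omega * t, and no multiple-hit correction is applied.
    """
    rng = np.random.default_rng(seed)
    syn_diffs = rng.binomial(n_syn, t, size=n_genes)
    nonsyn_diffs = rng.binomial(n_nonsyn, omega * t, size=n_genes)

    # The ratio is undefined when no synonymous difference is observed,
    # so those genes are dropped here (itself a source of distortion).
    keep = syn_diffs > 0
    d_s = syn_diffs[keep] / n_syn
    d_n = nonsyn_diffs[keep] / n_nonsyn
    ratio = d_n / d_s

    # Mean ratio, spread of the ratio, and fraction of genes retained.
    return ratio.mean(), ratio.std(), keep.mean()


if __name__ == "__main__":
    for t in (0.01, 0.05, 0.2, 0.5):
        mean_ratio, sd_ratio, kept = simulate_ratio(t)
        print(f"dS={t:<5} mean dN/dS={mean_ratio:.3f} "
              f"sd={sd_ratio:.3f} genes kept={kept:.2f}")
```

Under these assumptions the spread of the estimated ratio grows sharply as d(S) shrinks, and a growing share of genes has to be discarded, even though the underlying selection regime never changes. That is the kind of denominator-driven behavior the abstract cautions about for closely related taxa.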
Similar resources
Uncorrected Nucleotide Bias in mtDNA Can Mimic the Effects of Positive Darwinian Selection
The relative rates of nucleotide substitution at synonymous and nonsynonymous sites within protein-coding regions have been widely used to infer the action of natural selection from comparative sequence data. It is known, however, that mutational and repair biases can affect rates of evolution at both synonymous and nonsynonymous sites. More importantly, it is also known that synonymous sites a...
Unbiased Estimate of Synonymous and Nonsynonymous Substitution Rates with Nonstationary Base Composition
The measurement of synonymous and non-synonymous substitution rates (dS and dN) is useful for assessing selection operating on protein sequences or for investigating mutational processes affecting genomes. In particular, the ratio dN/dS is expected to be a good proxy for ω, the ratio of fixation probabilities of non-synonymous mutations relative to that of neutral mutations. Standard methods for ...
The Influence of Selection for Protein Stability on dN/dS Estimations
Understanding the relative contributions of various evolutionary processes (purifying selection, neutral drift, and adaptation) is fundamental to evolutionary biology. A common metric to distinguish these processes is the ratio of nonsynonymous to synonymous substitutions (i.e., dN/dS) interpreted from the neutral theory as a null model. However, from biophysical considerations, mutations have no...
Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution.
An excess of nonsynonymous substitutions over synonymous ones is an important indicator of positive selection at the molecular level. A lineage that underwent Darwinian selection may have a nonsynonymous/synonymous rate ratio (dN/dS) that is different from those of other lineages or greater than one. In this paper, several codon-based likelihood models that allow for variable dN/dS ratios among...
One-rate models outperform two-rate models in site-specific dN/dS estimation
Methods that infer site-specific dN/dS, the ratio of nonsynonymous to synonymous substitution rates, from coding data have been developed primarily to identify positively selected sites (dN/dS > 1). As a consequence, it is largely unknown how well different inference methods can infer dN/dS point estimates at individual sites. In particular, dN/dS may be estimated using either a one-rate approa...
Journal title:
Volume 1, Issue
Pages -
Publication date 2009